CHAR vs VARCHAR in InnoDB: Impact on Storage, Fragmentation & Performance
Choosing between CHAR and VARCHAR in InnoDB significantly affects storage layout, row fragmentation, update performance, and buffer pool efficiency—especially when rows vary greatly in length or are frequently updated.
CHAR is fixed-length. MySQL pads values with spaces to make them the same size.
VARCHAR is variable-length, storing only the actual string plus 1–2 bytes of length metadata.
CHAR always allocates the same number of bytes regardless of actual content.
VARCHAR rows expand and shrink depending on stored value size.
InnoDB stores rows inside pages (16 KB by default).
Fixed-length CHAR columns lead to predictable, stable row size → lower chance of page splits.
Variable-length VARCHAR columns can cause rows to grow during updates → page-level fragmentation.
If a VARCHAR row outgrows its original page, InnoDB performs a page split, creating fragmentation and extra I/O.
CHAR reduces fragmentation because row size never changes.
VARCHAR increases fragmentation when updates increase value length.
Fragmentation leads to more scattered pages → slower full scans and worse buffer pool efficiency.
Over time, highly dynamic VARCHAR columns may require OPTIMIZE TABLE to defragment.
CHAR provides faster comparisons because the length is constant and no length-prefix decoding is needed.
VARCHAR requires length checks and may involve collation evaluation over varying byte sizes.
CHAR is faster for write-heavy workloads with stable-length data.
VARCHAR is more space-efficient for data that is highly variable in size.
Values have fixed or near-fixed length (e.g., SHA-1 hashes, country codes).
Frequent updates occur and predictable row size avoids fragmentation.
High read/write performance is needed with minimal page splits.
Data varies significantly in size (names, addresses, descriptions).
Storage efficiency is more important than avoiding fragmentation.
Row updates rarely change size significantly.
In summary: CHAR provides predictable performance at the cost of fixed storage, while VARCHAR saves storage but may cause fragmentation and page splits in write-heavy or variable-length datasets.